Abstract

This paper presents the first combinatorial polynomial-time algorithm for minimizing submodular set functions, answering an open question posed in 1981 by Grötschel, Lovász, and Schrijver. The algorithm employs a scaling scheme that uses a flow in the complete directed graph on the underlying set with each arc capacity equal to the scaled parameter. The resulting algorithm runs in time bounded by a polynomial in the size of the underlying set and the largest length of the function value. The paper also presents a strongly polynomial-time version that runs in time bounded by a polynomial in the size of the underlying set independent of the function value.

1. Introduction

Grötschel, Lovász, and Schrijver [14] revealed the polynomial-time equivalence between the optimization and separation problems in combinatorial optimization via the ellipsoid method. Since then, many combinatorial problems have been shown to be polynomial-time solvable by means of their framework. The problem of minimizing submodular (set) functions is among these problems. Since the ellipsoid method is far from efficient in practice and is not combinatorial, efficient combinatorial algorithms for submodular function minimization have long been desired.

A function $f$ on all the subsets of a finite set $V$ is called submodular if it satisfies

$$f(X) + f(Y) \ge f(X \cup Y) + f(X \cap Y) \qquad \forall X, Y \subseteq V.$$

We suppose that $f(\emptyset) = 0$ without loss of generality throughout this paper.
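
As an illustration only, the defining inequality can be checked by brute force on a tiny ground set. The sketch below is not part of the algorithm of this paper; the graph `EDGES` and the helper names are hypothetical, and the enumeration is exponential in $|V|$.

```python
from itertools import combinations

def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_submodular(f, ground):
    """Brute-force test of f(X) + f(Y) >= f(X | Y) + f(X & Y) for all X, Y."""
    all_sets = list(subsets(ground))
    return all(f(X) + f(Y) >= f(X | Y) + f(X & Y)
               for X in all_sets for Y in all_sets)

# Hypothetical example data: weights of a small undirected graph.
EDGES = {("a", "b"): 3, ("b", "c"): 1, ("a", "c"): 2, ("c", "d"): 4}

def cut(X):
    """Cut capacity: total weight of edges with exactly one endpoint in X."""
    return sum(w for (u, v), w in EDGES.items() if (u in X) != (v in X))

def card_squared(X):
    """|X|^2 is supermodular, so it should fail the submodularity test."""
    return len(X) ** 2

V = ["a", "b", "c", "d"]
print(is_submodular(cut, V))
print(is_submodular(card_squared, V))
```

The cut capacity function passes the test and satisfies $f(\emptyset) = 0$, while $|X|^2$ fails it already for two distinct singletons.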

Submodular functions arise in various branches of mathematical engineering such as combinatorial optimization and information theory. There are also close connections between submodularity and convexity. Examples include the matroid rank function, the cut capacity function, and the entropy function. In each of these and other applications, the base polyhedron defined by

$$\mathrm{B}(f) = \{x \mid x \in \mathbf{R}^V,\ x(V) = f(V),\ \forall X \subseteq V : x(X) \le f(X)\} \tag{1.1}$$

often plays an important role, where $x(X) = \sum_{v \in X} x(v)$ for any $X \subseteq V$.

Linear optimization problems over base polyhedra are efficiently solvable by the greedy algorithm of Edmonds. Thus Grötschel, Lovász, and Schrijver assert that submodular function minimization, which is equivalent to the separation problem, is solvable in polynomial time by the ellipsoid method. Later, they also devised a strongly polynomial-time algorithm within their framework using the ellipsoid method.

A first step towards a combinatorial strongly polynomial-time algorithm was taken by Cunningham, who devised a strongly polynomial-time algorithm for testing membership in matroid polyhedra as well as a pseudopolynomial-time algorithm for minimizing submodular functions. Recently, Narayanan improved the running time bounds of these combinatorial algorithms by a rounding technique. Based on the minimum-norm base characterization of minimizers due to Fujishige, Sohoni gave another combinatorial pseudopolynomial-time algorithm for submodular function minimization.

For the problem of minimizing a symmetric submodular function over proper nonempty subsets, Queyranne presented a combinatorial strongly polynomial-time algorithm, extending the undirected minimum cut algorithm of Nagamochi and Ibaraki.

In this paper, we present a combinatorial polynomial-time algorithm for submodular function minimization. Our algorithm uses an augmenting path approach with reference to a convex combination of extreme points of the base polyhedron. Such an approach was first introduced by Cunningham for minimizing submodular functions that arise from the separation problem for matroid polyhedra. This was adapted for general submodular function minimization by Bixby, Cunningham, and Topkis and improved by Cunningham to obtain a pseudopolynomial-time algorithm.

A fundamental tool in these algorithms is to move from one extreme point of the 
base polyhedron to an adjacent extreme point via an exchange operation that increases 
one coordinate and decreases another coordinate by the same quantity. This quantity 
is called the exchange capacity. These previous methods maintain a directed graph on 
the underlying set that represents the possible exchange operations. They are inefficient 
since the lower bound on the size of each augmentation is too small. In traditional 
network flow problems, it is possible to surmount this difficulty by augmenting only on 
paths of sufficiently large capacity. However, it has proved difficult to adapt this
scaling approach to work in the setting of submodular function minimization, mainly 
because the amount of augmentation is determined by exchange capacities multiplied by 
the convex combination coefficients. These coefficients can be as small as the reciprocal 
of the maximum absolute value of the submodular function. 

To overcome this difficulty, we augment the directed graph corresponding to allowable exchanges with the complete directed graph on the underlying set, letting the capacity of this additional arc set depend directly on our scaling parameter. This technique was first introduced by Iwata [19], who used it to develop the first polynomial-time capacity-scaling algorithm for the submodular flow problem of Edmonds and Giles. This algorithm was later refined by Fleischer, Iwata, and McCormick into one of the fastest algorithms for submodular flow. Our work in this paper builds on ideas in this latter paper to develop a capacity-scaling, augmenting-path algorithm for submodular function minimization. The running time of the resulting algorithm is weakly polynomial, i.e., bounded by a polynomial in the size of the underlying set and the largest length of the function value. Even under the similarity assumption that the largest function value is bounded by a polynomial in the size of the underlying set, our algorithm is faster than the best previous combinatorial, pseudopolynomial-time algorithm.

We then modify our scaling algorithm to run in strongly polynomial time, i.e., in time bounded by a polynomial in the size of the underlying set, independently of the largest length of the function value. To make a weakly polynomial-time algorithm run in strongly polynomial time, Frank and Tardos developed a generic preprocessing technique that is applicable to a fairly wide class of combinatorial optimization problems including the submodular flow problem and testing membership in matroid polyhedra. However, this framework does not apply to submodular function minimization. Instead, we devise a combinatorial algorithm that repeatedly detects an element that belongs to every minimizer or an ordered pair of elements with the property that if the first belongs to a minimizer then the second does.

There are some practical problems, in dynamic flows, facility location, and multi-terminal source coding, where the polynomial-time solvability relies on a submodular function minimization routine. Goemans and Ramakrishnan [13] discussed a class of submodular function minimization problems over restricted families of subsets. Their solution is combinatorial modulo an oracle for submodular function minimization on distributive lattices. Our algorithm can be used to provide combinatorial, strongly polynomial-time algorithms for these problems.

This paper is organized as follows. Section 2 provides preliminaries on submodular functions. Section 3 presents a scaling algorithm for submodular function minimization, which runs in weakly polynomial time. Section 4 is devoted to the strongly polynomial-time algorithm. Finally, we discuss extensions in Section 5.



2. Preliminaries

We denote by $\mathbf{Z}$ and $\mathbf{R}$ the set of integers and the set of reals, respectively. Let $V$ be a finite nonempty set of cardinality $|V| = n$. For a vector $x \in \mathbf{R}^V$ we define a modular function $x : 2^V \to \mathbf{R}$ by $x(X) = \sum_{v \in X} x(v)$. For each $u \in V$, we denote by $\chi_u$ the unit vector in $\mathbf{R}^V$ such that $\chi_u(v) = 1$ if $v = u$ and $\chi_u(v) = 0$ otherwise.

Given a submodular function $f$ with $f(\emptyset) = 0$ and its associated base polyhedron $\mathrm{B}(f)$ as defined in (1.1), we call a vector $x \in \mathrm{B}(f)$ a base. An extreme point of $\mathrm{B}(f)$ is called an extreme base. A fundamental step in submodular function minimization algorithms is to move from one base $x$ to another base $x'$ via an exchange operation that increases one coordinate while decreasing another coordinate by the same amount. The maximum amount of increase that ensures $x' \in \mathrm{B}(f)$ is called the exchange capacity. More precisely, for any base $x \in \mathrm{B}(f)$ and any distinct $u, v \in V$ the exchange capacity is

$$c(x, u, v) = \max\{\alpha \mid \alpha \in \mathbf{R},\ x + \alpha(\chi_u - \chi_v) \in \mathrm{B}(f)\}. \tag{2.1}$$

The exchange capacity can also be expressed as

$$c(x, u, v) = \min\{f(X) - x(X) \mid u \in X \subseteq V \setminus \{v\}\}. \tag{2.2}$$

In general, computing $c(x, u, v)$ is as hard as submodular function minimization, even when $x$ is an extreme base. However, if $x$ is an extreme base, then for special pairs of vertices $u$ and $v$, the exchange capacity $c(x, u, v)$ can be computed with one function evaluation as follows.

Let $L = (v_1, v_2, \ldots, v_n)$ be a linear ordering of $V$. For any $k \in \{1, 2, \ldots, n\}$, we define $L(v_k) = \{v_1, v_2, \ldots, v_k\}$. Given such a linear ordering, the greedy algorithm of Edmonds computes

$$y(v_k) = f(L(v_k)) - f(L(v_{k-1})) \qquad (k = 1, 2, \ldots, n), \tag{2.3}$$

where $L(v_0) = \emptyset$. The resulting vector $y \in \mathbf{R}^V$ is an extreme base $y \in \mathrm{B}(f)$. Conversely, any extreme base can be generated by applying the greedy algorithm to an appropriate linear ordering. Note that a linear ordering $L = (v_1, v_2, \ldots, v_n)$ generates an extreme base $y$ if and only if $y(L(v_i)) = f(L(v_i))$ for $i = 1, 2, \ldots, n$. For any base $y \in \mathrm{B}(f)$, a set $X \subseteq V$ is called $y$-tight if $y(X) = f(X)$. A pair $(u, v)$ is called eligible for $y$ if $u$ immediately succeeds $v$ in some linear ordering that generates $y$. The following lemma enables us to compute an exchange capacity $c(y, u, v)$ if $(u, v)$ is eligible for $y$.
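
The greedy computation (2.3) can be sketched in a few lines of Python. This is a minimal illustration under assumptions: the cut function `f` on a hypothetical four-vertex graph stands in for the evaluation oracle, and `is_base` verifies membership in $\mathrm{B}(f)$ by exhaustive enumeration, which is only feasible for tiny ground sets.

```python
from itertools import combinations

# Hypothetical example data: weights of a small undirected graph.
EDGES = {("a", "b"): 3, ("b", "c"): 1, ("a", "c"): 2, ("c", "d"): 4}

def f(X):
    """Cut capacity of X; submodular with f(empty set) = 0."""
    return sum(w for (u, v), w in EDGES.items() if (u in X) != (v in X))

def greedy_extreme_base(f, ordering):
    """Return y with y(v_k) = f(L(v_k)) - f(L(v_{k-1})), as in (2.3)."""
    y = {}
    prefix = frozenset()
    prev_value = 0  # f(empty set) is assumed to be 0
    for v in ordering:
        prefix = prefix | {v}
        value = f(prefix)
        y[v] = value - prev_value
        prev_value = value
    return y

def is_base(f, y, ground):
    """Brute-force check of y(V) = f(V) and y(X) <= f(X) for every X."""
    items = list(ground)
    if sum(y.values()) != f(frozenset(items)):
        return False
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            if sum(y[v] for v in combo) > f(frozenset(combo)):
                return False
    return True

ordering = ["a", "b", "c", "d"]
y = greedy_extreme_base(f, ordering)
print(y)
print(is_base(f, y, ordering))
```

Each prefix $L(v_k)$ of the ordering is $y$-tight by construction, which is the characterization of generating orderings stated above.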

Lemma 2.1: Let $L$ be a linear ordering of $V$ that generates an extreme base $y \in \mathrm{B}(f)$. Let $L'$ be the linear ordering obtained by interchanging $u$ and $v$ that are consecutive in $L$, with $u$ immediately succeeding $v$. Then the extreme base $y'$ generated by $L'$ satisfies

$$y' = y + \beta(\chi_u - \chi_v) \tag{2.4}$$

with

$$\beta = f(L(u) \setminus \{v\}) - f(L(u)) + y(v). \tag{2.5}$$

Moreover, we have $c(y, u, v) = \beta$.

Proof. Equations (2.4) and (2.5) follow from the greedy algorithm (see (2.3)). By the definition (2.1) of the exchange capacity, we have $\beta \le c(y, u, v)$. Since $y(L(u)) = f(L(u))$, it follows from (2.2) and (2.5) that $\beta \ge c(y, u, v)$. Thus we obtain $\beta = c(y, u, v)$. ■

We will use Lemma 2.1 to transform one extreme base into another and to update the corresponding linear ordering.
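
To make Lemma 2.1 concrete, the sketch below computes $\beta$ with a single extra function evaluation and compares it with the brute-force expression (2.2). The example function and all helper names are hypothetical, and the brute-force routine is included only as a cross-check on a tiny instance.

```python
from itertools import combinations

# Hypothetical example data: weights of a small undirected graph.
EDGES = {("a", "b"): 3, ("b", "c"): 1, ("a", "c"): 2, ("c", "d"): 4}

def f(X):
    """Cut capacity of X; submodular with f(empty set) = 0."""
    return sum(w for (u, v), w in EDGES.items() if (u in X) != (v in X))

def greedy_extreme_base(f, ordering):
    """Extreme base generated by `ordering`, as in (2.3)."""
    y, prefix, prev = {}, frozenset(), 0
    for v in ordering:
        prefix = prefix | {v}
        y[v] = f(prefix) - prev
        prev = f(prefix)
    return y

def exchange_by_swap(f, ordering, y, u, v):
    """Apply Lemma 2.1; u must immediately succeed v in `ordering`."""
    pos = ordering.index(u)
    if pos == 0 or ordering[pos - 1] != v:
        raise ValueError("u must immediately succeed v in the ordering")
    prefix_u = frozenset(ordering[:pos + 1])       # L(u), which contains v
    beta = f(prefix_u - {v}) - f(prefix_u) + y[v]  # equation (2.5)
    new_y = dict(y)
    new_y[u] += beta                               # equation (2.4)
    new_y[v] -= beta
    new_ordering = list(ordering)
    new_ordering[pos - 1], new_ordering[pos] = u, v
    return beta, new_y, new_ordering

def brute_force_capacity(f, y, ground, u, v):
    """Equation (2.2): minimum of f(X) - y(X) over sets X with u in X, v not in X."""
    others = [w for w in ground if w not in (u, v)]
    gaps = []
    for r in range(len(others) + 1):
        for combo in combinations(others, r):
            X = frozenset(combo) | {u}
            gaps.append(f(X) - sum(y[w] for w in X))
    return min(gaps)

ordering = ["a", "b", "c", "d"]
y = greedy_extreme_base(f, ordering)
beta, new_y, new_ordering = exchange_by_swap(f, ordering, y, "c", "b")
print(beta, new_ordering, new_y)
```

On this instance the swapped ordering regenerates `new_y` through the greedy algorithm, and `beta` agrees with the brute-force minimum.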

For any vector $x \in \mathbf{R}^V$, we denote by $x^-$ the vector in $\mathbf{R}^V$ defined by $x^-(v) = \min\{0, x(v)\}$ for $v \in V$. The following fundamental lemma easily follows from a theorem of Edmonds on the vector reduction of polymatroids (see [12, Corollaries 3.4 and 3.5]).

Lemma 2.2: For a submodular function $f : 2^V \to \mathbf{R}$ we have

$$\max\{x^-(V) \mid x \in \mathrm{B}(f)\} = \min\{f(X) \mid X \subseteq V\}.$$

If $f$ is integer-valued, then the maximizer $x$ can be chosen from among integral bases. ■

We will not use the integrality property indicated in the latter half of this lemma.

Lemma 2.2 shows a min-max relation of strong duality. A weak duality is described as follows: for any base $x \in \mathrm{B}(f)$ and any $X \subseteq V$ we have $x^-(V) \le f(X)$. We call the difference $f(X) - x^-(V)$ a duality gap. Note that, if $f$ is integer-valued and the duality gap $f(X) - x^-(V)$ is less than one for some $x \in \mathrm{B}(f)$ and $X \subseteq V$, then $X$ minimizes $f$.
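
The weak duality can be observed numerically on a small instance. The following sketch is illustrative only: the function `f` (a cut function plus a hypothetical modular shift `SHIFT`) is an assumption, and only extreme bases are enumerated, so the best lower bound found need not attain the maximum in Lemma 2.2.

```python
from itertools import combinations, permutations

# Hypothetical example data: a cut function plus a modular term.
EDGES = {("a", "b"): 3, ("b", "c"): 1, ("a", "c"): 2, ("c", "d"): 4}
SHIFT = {"a": -6, "b": 1, "c": -2, "d": 3}

def f(X):
    """Submodular function with f(empty set) = 0 (cut capacity plus modular shift)."""
    cut = sum(w for (u, v), w in EDGES.items() if (u in X) != (v in X))
    return cut + sum(SHIFT[v] for v in X)

def greedy_extreme_base(f, ordering):
    """Extreme base generated by `ordering`, as in (2.3)."""
    y, prefix, prev = {}, frozenset(), 0
    for v in ordering:
        prefix = prefix | {v}
        y[v] = f(prefix) - prev
        prev = f(prefix)
    return y

def x_minus(x):
    """x^-(V): the sum of the negative components of x."""
    return sum(min(0, value) for value in x.values())

V = ["a", "b", "c", "d"]
all_sets = [frozenset(c) for r in range(len(V) + 1) for c in combinations(V, r)]
min_value = min(f(X) for X in all_sets)
lower_bounds = [x_minus(greedy_extreme_base(f, list(p))) for p in permutations(V)]
best_lower_bound = max(lower_bounds)
print(best_lower_bound, min_value)
```

Every value in `lower_bounds` is at most `min_value`, which is exactly the weak duality $x^-(V) \le f(X)$ restricted to extreme bases.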
3. A Scaling Algorithm

In this section, we describe a combinatorial algorithm for minimizing an integer-valued submodular function $f : 2^V \to \mathbf{Z}$ with $f(\emptyset) = 0$. We assume an evaluation oracle for the function value of $f$. Let $M$ denote an upper bound on $|f(X)|$ among $X \subseteq V$. Note that we can easily compute $M$ by $O(n)$ calls for the evaluation oracle as follows. Let $y$ be an extreme base generated by a linear ordering $L$. For any $X \subseteq V$, we have $y^-(V) \le y(X) \le f(X) \le \sum_{v \in X} \max\{0, f(\{v\})\}$. Thus we obtain $M = \max\{|y^-(V)|, \sum_{v \in V} \max\{0, f(\{v\})\}\}$.
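
The bound $M$ described above needs one greedy pass and one evaluation per singleton. A minimal sketch, assuming a hypothetical cut function as the oracle and using a brute-force maximum only to confirm the bound on a tiny instance:

```python
from itertools import combinations

# Hypothetical example data: weights of a small undirected graph.
EDGES = {("a", "b"): 3, ("b", "c"): 1, ("a", "c"): 2, ("c", "d"): 4}

def f(X):
    """Cut capacity of X; submodular with f(empty set) = 0."""
    return sum(w for (u, v), w in EDGES.items() if (u in X) != (v in X))

def upper_bound_M(f, ordering):
    """Bound |f(X)| over all X with 2n oracle calls."""
    # One greedy pass yields y^-(V) for the extreme base of `ordering`.
    y_minus, prefix, prev = 0, frozenset(), 0
    for v in ordering:
        prefix = prefix | {v}
        value = f(prefix)
        y_minus += min(0, value - prev)
        prev = value
    # The positive singleton values bound f(X) from above.
    positive_part = sum(max(0, f(frozenset({v}))) for v in ordering)
    return max(abs(y_minus), positive_part)

def brute_force_max_abs(f, ground):
    """Largest |f(X)| over all subsets, for cross-checking only."""
    items = list(ground)
    return max(abs(f(frozenset(c)))
               for r in range(len(items) + 1)
               for c in combinations(items, r))

V = ["a", "b", "c", "d"]
M = upper_bound_M(f, V)
print(M, brute_force_max_abs(f, V))
```

The bound is not tight in general; it only needs to dominate $|f(X)|$ so that the initial scaling parameter is large enough.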

3.1. Algorithm Outline

As indicated earlier, our algorithm uses an augmenting path approach to submodular function minimization. As with previous algorithms, we maintain a base $x \in \mathrm{B}(f)$ as a convex combination of extreme bases $y_i \in \mathrm{B}(f)$ indexed by $i \in I$, so that $x = \sum_{i \in I} \lambda_i y_i$. Roughly speaking, these previous algorithms use a directed graph with the arc set defined by the pairs of vertices that are eligible for $y_i$ for some $i \in I$. They seek to increase $x^-(V)$ by performing exchange operations along a path of arcs from vertices $s$ with $x(s) < 0$ to vertices $t$ with $x(t) > 0$. The algorithms stop with an optimal $x$ when there are no more augmenting paths. The corresponding minimizer $X$ is determined by the set of vertices reachable from vertices $s$ with $x(s) < 0$.

To adapt this procedure to a scaling framework, we use a complete directed graph on $V$ with arc capacities that depend directly on our scaling parameter $\delta$, an idea first introduced for submodular flows in [19]. Let $\varphi : V \times V \to \mathbf{R}$ be skew-symmetric, i.e., $\varphi(u, v) + \varphi(v, u) = 0$ for $u, v \in V$, and $\delta$-feasible in that it satisfies capacity constraints $-\delta \le \varphi(u, v) \le \delta$ for every $u, v \in V$. The function $\varphi$ can be regarded as a flow in the complete directed graph $G = (V, E)$ with the vertex set $V$ and the arc set $E = V \times V$. The boundary $\partial\varphi : V \to \mathbf{R}$ of $\varphi$ is defined by

$$\partial\varphi(v) = \sum_{u \in V} \varphi(u, v) \qquad (v \in V). \tag{3.1}$$

Instead of trying to maximize $x^-(V)$ directly, we define $z = x - \partial\varphi$. Our algorithm seeks to maximize $z^-(V)$ and thereby increases $x^-(V)$.
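
As a small illustration of these definitions, the sketch below stores a skew-symmetric flow as a dictionary over ordered pairs of distinct vertices and evaluates the boundary (3.1) and $z = x - \partial\varphi$. The data layout and the numbers are hypothetical choices made for the example.

```python
def boundary(phi, vertices):
    """Boundary of phi as in (3.1): d(v) is the sum of phi(u, v) over u."""
    return {v: sum(phi[(u, v)] for u in vertices if u != v) for v in vertices}

def is_delta_feasible(phi, vertices, delta):
    """Check skew-symmetry and the capacity constraints -delta <= phi <= delta."""
    pairs = [(u, v) for u in vertices for v in vertices if u != v]
    skew = all(phi[(u, v)] + phi[(v, u)] == 0 for (u, v) in pairs)
    bounded = all(-delta <= phi[p] <= delta for p in pairs)
    return skew and bounded

def negative_part(vec):
    """Sum of the negative components of a vector given as a dict."""
    return sum(min(0, value) for value in vec.values())

# Hypothetical example: three vertices, one unit of flow on the arc (a, b).
vertices = ["a", "b", "c"]
phi = {(u, v): 0 for u in vertices for v in vertices if u != v}
phi[("a", "b")], phi[("b", "a")] = 1, -1
x = {"a": -3, "b": 2, "c": 1}

d_phi = boundary(phi, vertices)
z = {v: x[v] - d_phi[v] for v in vertices}
print(d_phi, z, negative_part(x), negative_part(z))
```

Here one unit of flow on $(a, b)$ raises $z^-(V)$ from $-3$ to $-2$ without changing $x$, which is the effect the augmentations in the algorithm rely on.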

We also maintain linear orderings $L_i$ for $i \in I$ and extreme bases $y_i$ generated by them. We start with an arbitrary linear ordering $L$ on $V$ and the extreme base $x \in \mathrm{B}(f)$ generated by $L$. In addition, we start with the zero flow $\varphi = 0$. Thus, initially $z^-(V) = x^-(V) \ge -nM$. We seek to increase $z^-(V)$, and in doing so, obtain improvements in $x^-(V)$, via the $\delta$-feasibility of $\varphi$.

The algorithm consists of scaling phases with a positive parameter $\delta$. It starts with $\delta = M$, cuts $\delta$ in half at the beginning of each scaling phase, and ends with $\delta < 1/n^2$. Each $\delta$-scaling phase maintains a $\delta$-feasible flow $\varphi$, and uses the residual graph $G(\varphi) = (V, E(\varphi))$ with the arc set

$$E(\varphi) = \{(u, v) \mid u, v \in V,\ u \ne v,\ \varphi(u, v) \le 0\}. \tag{3.2}$$

Intuitively, $E(\varphi)$ consists of the arcs through which we can augment the flow $\varphi$ by $\delta$ without violating the capacity constraints.
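
The residual graph (3.2) and the set of vertices reachable from a given source set can be sketched with a breadth-first search. The flow values below are hypothetical, and the dictionary layout for $\varphi$ is an assumption of this example rather than a prescribed data structure.

```python
from collections import deque

def residual_arcs(phi, vertices):
    """Arc set E(phi) of (3.2): ordered pairs of distinct vertices with phi(u, v) <= 0."""
    return {(u, v) for u in vertices for v in vertices
            if u != v and phi[(u, v)] <= 0}

def reachable(sources, phi, vertices):
    """Vertices reachable from `sources` by directed paths in G(phi)."""
    arcs = residual_arcs(phi, vertices)
    seen = set(sources)
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in vertices:
            if (u, v) in arcs and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

# Hypothetical example: the arcs leaving vertex a are saturated.
vertices = ["a", "b", "c"]
phi = {(u, v): 0 for u in vertices for v in vertices if u != v}
phi[("a", "b")], phi[("b", "a")] = 1, -1
phi[("a", "c")], phi[("c", "a")] = 1, -1

print(reachable({"a"}, phi, vertices))
print(reachable({"b"}, phi, vertices))
```

Because both arcs out of `a` carry positive flow, nothing else is reachable from `a`, whereas `b` reaches every vertex.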

A $\delta$-scaling phase starts by preprocessing $\varphi$ to make it $\delta$-feasible, and then repeatedly searches to send flow along augmenting paths in $G(\varphi)$ from $S := \{v \mid v \in V,\ z(v) \le -\delta\}$ to $T := \{v \mid v \in V,\ z(v) \ge \delta\}$. Such a directed path is called a $\delta$-augmenting path.

If there are no $\delta$-augmenting paths, then the algorithm checks whether there is a pair $(u, v)$ of vertices such that $u$ is reachable from $S$ by a path with residual capacity at least $\delta$, $v$ is not, and $u$ immediately succeeds $v$ in a linear ordering that generates $y_i$ for some $i \in I$. We perform the appropriate exchange operation, and modify $\varphi$ by creating residual capacity on $(u, v)$ so that $z = x - \partial\varphi$ is invariant. This operation may increase the set of vertices reachable from $S$ on paths of residual capacity at least $\delta$. Once a $\delta$-augmenting path is found, the algorithm augments the flow $\varphi$ by $\delta$ through the path without changing $x$. As a consequence, $z^-(V)$ increases by $\delta$ in one iteration. This is an extension of a technique for handling exchange capacity arcs that was first developed for submodular flows.

3.2. Algorithm Details

We now describe the scaling algorithm more precisely. Figure 1 provides a formal description.

At the beginning of the $\delta$-scaling phase, after $\delta$ is cut in half, the current flow $\varphi$ is $2\delta$-feasible. Then the algorithm modifies each $\varphi(u, v)$ to the nearest value within the interval $[-\delta, \delta]$ to make it $\delta$-feasible. This may decrease $z^-(V)$ for $z = x - \partial\varphi$ by at most $\binom{n}{2}\delta$. The rest of the $\delta$-scaling phase aims at increasing $z^-(V)$ by augmenting flow along $\delta$-augmenting paths.



Let $W$ denote the set of vertices reachable by directed paths from $S$ in $G(\varphi)$. For each $i \in I$ we keep a linear ordering $L_i$ that generates $y_i$. We call a vertex $v \in V$ active in $L_i$ if $v$ is the last vertex in $L_i$ among vertices in $V \setminus W$ that satisfies $W \setminus L_i(v) \ne \emptyset$. If $v$ is active in $L_i$, we call $(i, v)$ an active pair. We denote by $Z$ the set of the current active pairs.

If $W \cap T = \emptyset$, there is no $\delta$-augmenting path in $G(\varphi)$. Then, as long as there is an active pair $(i, v)$, i.e., $Z \ne \emptyset$, the algorithm repeatedly picks an active pair $(i, v) \in Z$ and applies Push$(i, u, v)$ to $u$ that succeeds $v$ in $L_i$. Note that $v$ active implies that $u \in W$.

The operation Push$(i, u, v)$ starts with reducing the flow through $(u, v)$ by $\alpha = \min\{\delta, \lambda_i c(y_i, u, v)\}$. The boundary $\partial\varphi$ moves to $\partial\varphi + \alpha(\chi_u - \chi_v)$. The operation Push$(i, u, v)$ is called saturating if $\alpha = \lambda_i c(y_i, u, v)$. Otherwise, it is called nonsaturating. A nonsaturating Push$(i, u, v)$ adds to $I$ a new index $k$ with $y_k := y_i$, $\lambda_k := \lambda_i - \alpha/c(y_i, u, v)$, and $L_k := L_i$. Whether the Push$(i, u, v)$ is saturating or not, it updates
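SFM($f$):
    Input: $f : 2^V \to \mathbf{Z}$
    Output: $X \subseteq V$ minimizing $f$
    Initialization:
        $L_i \leftarrow$ a linear ordering on $V$
        $x \leftarrow$ an extreme base in $\mathrm{B}(f)$ generated by $L_i$
        $I \leftarrow \{i\}$, $y_i \leftarrow x$, $\lambda_i \leftarrow 1$
        $\varphi \leftarrow 0$
        $\delta \leftarrow M$
    While $\delta \ge 1/n^2$ do
        $\delta \leftarrow \delta/2$
        For $(u, v) \in E$ do
            If $\varphi(u, v) > \delta$ then $\varphi(u, v) \leftarrow \delta$
            If $\varphi(u, v) < -\delta$ then $\varphi(u, v) \leftarrow -\delta$
        $S \leftarrow \{v \mid x(v) \le \partial\varphi(v) - \delta\}$
        $T \leftarrow \{v \mid x(v) \ge \partial\varphi(v) + \delta\}$
        $W \leftarrow$ the set of vertices reachable from $S$ in $G(\varphi)$
        $Z \leftarrow$ the set of active pairs $(i, v)$ of $i \in I$ and $v \in V$
        While $S \ne \emptyset$, $T \ne \emptyset$ and $Z \ne \emptyset$ do
            While $W \cap T = \emptyset$ and $Z \ne \emptyset$ do
                Find an active pair $(i, v) \in Z$.
                Let $u$ be the vertex succeeding $v$ in $L_i$.
                Apply Push$(i, u, v)$.
                Update $W$ and $Z$.
            If $W \cap T \ne \emptyset$ then
                Let $P$ be a directed path from $S$ to $T$ in $G(\varphi)$.
                For $(u, v) \in P$ do $\varphi(u, v) \leftarrow \varphi(u, v) + \delta$, $\varphi(v, u) \leftarrow \varphi(v, u) - \delta$
                Update $S$, $T$, $W$, and $Z$.
                Express $x = \sum_{i \in I} \lambda_i y_i$ by possibly smaller affinely independent subset $I$ and positive coefficients $\lambda_i > 0$ for $i \in I$.
    If $S = \emptyset$ then $X = \emptyset$ else if $T = \emptyset$ then $X = V$ else $X = W$
    End.

Figure 1: A scaling algorithm for submodular function minimization.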
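Push$(i, u, v)$:
    $\alpha \leftarrow \min\{\delta, \lambda_i c(y_i, u, v)\}$
    $\varphi(u, v) \leftarrow \varphi(u, v) - \alpha$
    $\varphi(v, u) \leftarrow \varphi(v, u) + \alpha$
    If $\alpha < \lambda_i c(y_i, u, v)$ then
        $k \leftarrow$ a new index
        $I \leftarrow I \cup \{k\}$
        $\lambda_k \leftarrow \lambda_i - \alpha/c(y_i, u, v)$
        $\lambda_i \leftarrow \alpha/c(y_i, u, v)$
        $y_k \leftarrow y_i$
        $L_k \leftarrow L_i$
    $y_i \leftarrow y_i + c(y_i, u, v)(\chi_u - \chi_v)$
    Update $L_i$ by interchanging $u$ and $v$.

Figure 2: Algorithmic description of the operation Push$(i, u, v)$.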

$y_i$ as $y_i := y_i + c(y_i, u, v)(\chi_u - \chi_v)$, $\lambda_i := \alpha/c(y_i, u, v)$ if $c(y_i, u, v) > 0$, and $L_i$ by interchanging $u$ and $v$. Then the current base $x$ moves to $x + \alpha(\chi_u - \chi_v)$. Thus $z = x - \partial\varphi$ is invariant.
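
The push operation can be sketched as follows, with the exchange capacity obtained from Lemma 2.1. This is a minimal illustration under assumptions: the `state` dictionary, the example function, and all helper names are hypothetical, exact arithmetic is done with `Fraction`, and the sketch does not check that $(i, v)$ is an active pair.

```python
from fractions import Fraction

# Hypothetical example data: weights of a small undirected graph.
EDGES = {("a", "b"): 3, ("b", "c"): 1, ("a", "c"): 2, ("c", "d"): 4}

def f(X):
    """Cut capacity of X; submodular with f(empty set) = 0."""
    return sum(w for (u, v), w in EDGES.items() if (u in X) != (v in X))

def greedy_extreme_base(f, ordering):
    """Extreme base generated by `ordering`, as in (2.3)."""
    y, prefix, prev = {}, frozenset(), 0
    for v in ordering:
        prefix = prefix | {v}
        y[v] = f(prefix) - prev
        prev = f(prefix)
    return y

def exchange_capacity(f, order, y, u, v):
    """Lemma 2.1; u must immediately succeed v in `order`."""
    pos = order.index(u)
    assert pos > 0 and order[pos - 1] == v
    prefix_u = frozenset(order[:pos + 1])
    return f(prefix_u - {v}) - f(prefix_u) + y[v]

def push(f, state, i, u, v, delta):
    """Push(i, u, v) following Figure 2; returns the amount alpha."""
    lam, ys, orders, phi = state["lam"], state["ys"], state["orders"], state["phi"]
    c = exchange_capacity(f, orders[i], ys[i], u, v)
    alpha = min(delta, lam[i] * c)
    phi[(u, v)] -= alpha
    phi[(v, u)] += alpha
    if alpha < lam[i] * c:            # nonsaturating: keep a copy of the old y_i
        k = max(lam) + 1
        lam[k] = lam[i] - alpha / c
        lam[i] = alpha / c
        ys[k] = dict(ys[i])
        orders[k] = list(orders[i])
    ys[i][u] += c                     # y_i moves to the adjacent extreme base
    ys[i][v] -= c
    pos = orders[i].index(u)
    orders[i][pos - 1], orders[i][pos] = u, v
    return alpha

def current_z(state, vertices):
    """z = x - boundary(phi), with x the convex combination of the y_i."""
    lam, ys, phi = state["lam"], state["ys"], state["phi"]
    x = {v: sum(lam[i] * ys[i][v] for i in lam) for v in vertices}
    return {v: x[v] - sum(phi[(u, v)] for u in vertices if u != v) for v in vertices}

vertices = ["a", "b", "c", "d"]
state = {
    "lam": {0: Fraction(1)},
    "ys": {0: greedy_extreme_base(f, vertices)},
    "orders": {0: list(vertices)},
    "phi": {(u, v): Fraction(0) for u in vertices for v in vertices if u != v},
}
z_before = current_z(state, vertices)
alpha = push(f, state, 0, "c", "b", Fraction(1))
z_after = current_z(state, vertices)
print(alpha, z_before == z_after, state["orders"][0])
```

In this run the push is nonsaturating, so a second index is created and the coefficients still sum to one, while $z$ is unchanged as claimed in the text.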

Each time the algorithm applies the push operation, it updates the set $W$ of vertices reachable from $S$ in $G(\varphi)$ and the set $Z$ of active pairs. If Push$(i, u, v)$ is nonsaturating, it makes $v$ reachable from $S$ in $G(\varphi)$, and hence $W$ is enlarged. Once $v$ becomes reachable from $S$ in $G(\varphi)$, it will never become active again for any $i \in I$ until the algorithm finds a $\delta$-augmenting path or all the active pairs disappear. Note that we encounter at most $n$ nonsaturating pushes before we find a $\delta$-augmenting path or all the active pairs disappear. Each time the algorithm picks an active pair $(i, v)$ and applies Push$(i, u, v)$, the vertex $v$ shifts towards the end of $L_i$. Hence the algorithm picks an active pair $(i, v)$ at most $n$ times before $v$ enters $W$ or $W \setminus L_i(v) = \emptyset$. At this point $v$ becomes inactive, and remains inactive until the next augmentation. Hence, for each $i \in I$ the total time required for processing active vertices in $L_i$ is $O(n^2)$.

We note that we could relax the definition of an active vertex to include any vertex $v \in V \setminus W$ whose immediate successor in $L_i$ belongs to $W$. The correctness argument would apply without modifications. However, care is needed to obtain an efficient implementation.

If we find a $\delta$-augmenting path, the algorithm augments $\delta$ units of flow along the path, which effectively increases $z^-(V)$ by $\delta$. We also compute an expression for $x$ as a convex combination of at most $n$ affinely independent extreme bases $y_i$, chosen from the current $y_i$'s. This computation is a standard linear programming technique of transforming feasible solutions into basic feasible solutions. If the set of extreme points is not affinely independent, there is a set of coefficients $\mu_i$ for $i \in I$ that is not identically zero and satisfies $\sum_i \mu_i y_i = 0$ and $\sum_i \mu_i = 0$. Using Gaussian elimination, we can start computing such $\mu_i$ until a dependency is detected. At this point, we eliminate the dependency by computing $\theta := \min\{\lambda_i/\mu_i \mid \mu_i > 0\}$ and update $\lambda_i := \lambda_i - \theta\mu_i$ for $i \in I$. At least one $i \in I$ satisfies $\lambda_i = 0$. Delete such $i$ from $I$. We continue this procedure until we eventually obtain affine independence. Since a new index $k$ is added to $I$ only as a result of a nonsaturating push, $|I| \le 2n$ after finding an augmenting path. The bottleneck in this procedure is the time spent computing the coefficients $\mu_i$, which is $O(n^3)$ overall.
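
The reduction to an affinely independent subset can be sketched with exact rational Gaussian elimination. The code below is a simple illustration, not the incremental $O(n^3)$ implementation referred to above: it recomputes a dependency from scratch in every round, and the function names and the example points are hypothetical.

```python
from fractions import Fraction

def null_vector(columns):
    """Return a nonzero mu with sum_j mu[j] * columns[j] = 0, or None if none exists."""
    m = len(columns)
    rows = [[Fraction(col[r]) for col in columns] for r in range(len(columns[0]))]
    pivots = []  # (row index, column index) of each pivot found so far
    r = 0
    for c in range(m):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            # Column c depends on the pivot columns: read off the dependency.
            mu = [Fraction(0)] * m
            mu[c] = Fraction(1)
            for pr, pc in pivots:
                mu[pc] = -rows[pr][c]
            return mu
        rows[r], rows[pivot] = rows[pivot], rows[r]
        scale = rows[r][c]
        rows[r] = [a / scale for a in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append((r, c))
        r += 1
    return None

def reduce_combination(points, lam):
    """Rewrite sum_i lam[i] * points[i] using affinely independent points only."""
    points = [list(p) for p in points]
    lam = [Fraction(l) for l in lam]
    while True:
        # Appending a 1 to each point encodes the condition sum_i mu_i = 0.
        mu = null_vector([p + [1] for p in points])
        if mu is None:
            return points, lam
        theta = min(l / m for l, m in zip(lam, mu) if m > 0)
        lam = [l - theta * m for l, m in zip(lam, mu)]
        keep = [j for j, l in enumerate(lam) if l != 0]
        points = [points[j] for j in keep]
        lam = [lam[j] for j in keep]

# Hypothetical example: the centre of a square as a combination of its four corners.
corners = [[0, 0], [2, 0], [0, 2], [2, 2]]
weights = [Fraction(1, 4)] * 4
new_points, new_lam = reduce_combination(corners, weights)
print(new_points, new_lam)
```

Each round removes at least one index, so the loop terminates after at most $|I|$ rounds, and the represented point is unchanged because $\sum_i \mu_i y_i = 0$ and $\sum_i \mu_i = 0$.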

A $\delta$-scaling phase ends when either $S = \emptyset$, $T = \emptyset$, or $Z = \emptyset$. In the last case, we have a set of vertices $W \subseteq V$ that are reachable from $S$ in $G(\varphi)$ such that $W \cap T = \emptyset$.

Lemma 3.1: If $Z = \emptyset$, then $W$ is tight for $x$.

Proof. If $Z = \emptyset$, for each $i \in I$ the first $|W|$ vertices in $L_i$ must belong to $W$. Then it follows from (2.3) that $y_i(W) = f(W)$. Since $x = \sum_{i \in I} \lambda_i y_i$ and $\sum_{i \in I} \lambda_i = 1$, this implies $x(W) = \sum_{i \in I} \lambda_i y_i(W) = f(W)$. ■

3.3. Correctness and Complexity

We now investigate the number of iterations in each $\delta$-scaling phase. To do this, we prove relaxed weak and strong dualities. The next lemma shows a relaxed weak duality.

Lemma 3.2: For any base $x \in \mathrm{B}(f)$ and any $\delta$-feasible flow $\varphi$, the vector $z = x - \partial\varphi$ satisfies $z^-(V) \le f(X) + \binom{n}{2}\delta$ for any $X \subseteq V$.

Proof. For any $X \subseteq V$ we have $x(X) \le f(X)$ and $\partial\varphi(X) \ge -\binom{n}{2}\delta$, and hence $z^-(V) \le z(X) \le f(X) + \binom{n}{2}\delta$. ■

A relaxed strong duality is given as follows.

Lemma 3.3: At the end of each $\delta$-scaling phase, the following (i)-(iii) hold for $x$ and $z = x - \partial\varphi$.

(i) If $S = \emptyset$, then $x^-(V) \ge f(\emptyset) - n^2\delta$ and $z^-(V) \ge f(\emptyset) - n\delta$.

(ii) If $T = \emptyset$, then $x^-(V) \ge f(V) - n^2\delta$ and $z^-(V) \ge f(V) - n\delta$.

(iii) If $W$ is tight for $x$, then $x^-(V) \ge f(W) - n^2\delta$ and $z^-(V) \ge f(W) - n\delta$.

Proof. When the $\delta$-scaling phase finishes with $S = \emptyset$, we have $x(v) > \partial\varphi(v) - \delta \ge -n\delta$ for every $v \in V$, which implies $x^-(V) \ge f(\emptyset) - n^2\delta$ as well as $z^-(V) \ge f(\emptyset) - n\delta$. Similarly, when the $\delta$-scaling phase finishes with $T = \emptyset$, we have $x(v) < \partial\varphi(v) + \delta \le n\delta$ for every $v \in V$, which implies $x^-(V) \ge x(V) - n^2\delta = f(V) - n^2\delta$ as well as $z^-(V) \ge x(V) - n\delta$.

When the $\delta$-scaling phase ends with $x(W) = f(W)$ due to Lemma 3.1, then $S \subseteq W \subseteq V \setminus T$ and $\partial\varphi(W) \le 0$. By the definitions of $S$ and $T$, we also have $x(v) > \partial\varphi(v) - \delta \ge -n\delta$ for every $v \in V \setminus W$ and $x(v) < \partial\varphi(v) + \delta \le n\delta$ for every $v \in W$. Therefore we have $x^-(V) = x^-(W) + x^-(V \setminus W) \ge x(W) - n\delta|W| - n\delta|V \setminus W| = f(W) - n^2\delta$ as well as $z^-(V) = z^-(W) + z^-(V \setminus W) \ge x(W) - \partial\varphi(W) - \delta|W| - \delta|V \setminus W| \ge f(W) - n\delta$. ■



Lemma 3.3 implies that at the beginning of the $\delta$-scaling phase, after $\delta$ is cut in half, $z^-(V)$ is at least $f(X) - 2n\delta$ for some $X \subseteq V$. Making the current flow $\delta$-feasible decreases $z^-(V)$ by at most $\binom{n}{2}\delta$. Each $\delta$-augmentation increases $z^-(V)$ by $\delta$. Since $z^-(V)$ is at most $f(X) + \binom{n}{2}\delta$ at the end of a $\delta$-phase by Lemma 3.2, the number of $\delta$-augmentations per phase is at most $n^2 + n$ for all phases after the first. Since $z^-(V) = x^-(V) \ge -nM$ at the start of the algorithm, setting the initial $\delta = M$ is more than sufficient to obtain a similar bound on the number of augmentations in the first phase.
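
Spelled out, the counting argument reads as follows, where $a$ denotes the number of $\delta$-augmentations in a phase after the first (the symbol $a$ is introduced here only for this calculation):

```latex
\begin{align*}
  f(X) - 2n\delta - \binom{n}{2}\delta + a\,\delta
    &\le z^-(V) \le f(X) + \binom{n}{2}\delta, \\
  \text{hence}\quad a
    &\le 2n + 2\binom{n}{2} = 2n + n(n-1) = n^2 + n .
\end{align*}
```

The left-hand side combines the lower bound inherited from the previous phase (Lemma 3.3 with parameter $2\delta$), the loss from restoring $\delta$-feasibility, and the gain of $\delta$ per augmentation; the right-hand side is Lemma 3.2 applied to the same set $X$.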



As an immediate consequence of Lemmas 2.2 and 3.3, we also obtain the following.

Theorem 3.4: The algorithm obtains a minimizer of $f$ at the end of the $\delta$-scaling phase with $\delta < 1/n^2$.

Proof. By Lemma 3.3, the output $X$ of the algorithm satisfies $x^-(V) \ge f(X) - n^2\delta > f(X) - 1$. For any $Y \subseteq V$, the weak duality in Lemma 2.2 asserts $x^-(V) \le f(Y)$. Thus we have $f(X) - 1 < f(Y)$, which implies by the integrality of $f$ that $X$ minimizes $f$. ■



Theorem 3.5: Algorithm SFM runs in $O(n^5 \log(nM))$ time.

Proof. The algorithm starts with $\delta = M$ and ends with $\delta < 1/n^2$, so the algorithm consists of $O(\log(nM))$ scaling phases. Each scaling phase finds $O(n^2)$ $\delta$-augmenting paths. To find an augmenting path, we perform at most $O(n^2)$ pushes per extreme base. A saturating push requires $O(1)$ time while a nonsaturating one $O(n)$ time. Here, note that there are less than $n$ nonsaturating pushes per augmenting path. Hence, the time spent in pushes per augmenting path is $O(n^3)$. After each augmentation, we also update the expression $x = \sum_{i \in I} \lambda_i y_i$, which also takes $O(n^3)$ time per augmentation. Thus the overall complexity of SFM is $O(n^5 \log(nM))$. ■

In this section, we have shown a weakly polynomial-time algorithm for minimizing integer-valued submodular functions. The integrality of a submodular function $f$ guarantees that if we have a base $x \in \mathrm{B}(f)$ and a subset $X$ of $V$ such that the duality gap $f(X) - x^-(V)$ is less than one, $X$ is a minimizer of $f$. Except for this we have not used the integrality of $f$. It follows that for any real-valued submodular function $f : 2^V \to \mathbf{R}$, if we are given a positive lower bound $\epsilon$ for the difference between the second minimum and the minimum value of $f$, the present algorithm works for the submodular function $(1/\epsilon)f$ and runs in $O(n^5 \log(nM/\epsilon))$ time, where $M$ is an upper bound on $|f(X)|$ among $X \subseteq V$.

4. A Strongly Polynomial-Time Algorithm

This section presents a strongly polynomial-time algorithm for minimizing submodular functions using the scaling algorithm in Section 3. The new algorithm exploits the following proximity lemma.

Lemma 4.1: At the end of the $\delta$-scaling phase, if $x(w) < -n^2\delta$, then $w$ belongs to every minimizer of $f$.

Proof. Let $X$ be any minimizer of $f$. There exists a vector $y \in \mathrm{B}(f)$ with $x^- \le y^-$ such that $y^-(V) = f(X)$. Note that $y(v) \ge 0$ for each $v \in V \setminus X$. By Lemma 3.3, there exists a subset $Y \subseteq V$ such that $x^-(V) \ge f(Y) - n^2\delta$. Then we have $y^-(w) - x^-(w) \le y^-(V) - x^-(V) \le f(X) - f(Y) + n^2\delta \le n^2\delta$. This implies $y(w) < 0$ due to the assumption, and hence $w \in X$. ■

Let $f : 2^V \to \mathbf{R}$ be a submodular function and $x \in \mathrm{B}(f)$ an extreme base whose components are bounded from above by $\eta > 0$. Assume that there exists a subset $Y \subseteq V$ such that $f(Y) \le -\kappa$ for some positive parameter $\kappa$, which will be specified later as $\eta/2$. We then apply the scaling algorithm starting with $\delta = \eta$ and the extreme base $x \in \mathrm{B}(f)$. After $\lceil \log_2(n^3\eta/\kappa) \rceil$ scaling phases, $\delta$ becomes less than $\kappa/n^3$. Since $x(Y) \le f(Y) \le -\kappa$, at least one element $w \in Y$ satisfies $x(w) < -n^2\delta$. By Lemma 4.1, such an element $w$ belongs to every minimizer of $f$. We denote this procedure by Fix$(f, x, \eta)$.

We now discuss how to apply this procedure to design a strongly polynomial-time algorithm for minimizing a submodular function $f$. If $f(V) > 0$, we replace the value $f(V)$ by zero. The set of minimizers remains the same unless the minimum value is zero, in which case we may assert that $\emptyset$ minimizes $f$.

An ordered pair $(u, v)$ of distinct vertices $u, v \in V$ is said to be compatible with $f$ if $u \in X$ implies $v \in X$ for every minimizer $X$ of $f$. Our algorithm keeps a directed acyclic graph $D = (V, F)$ whose arcs are compatible with $f$. Initially, the arc set $F$ is empty. Each time the algorithm finds a compatible pair $(u, v)$ with $f$, it adds $(u, v)$ to $F$. When this gives rise to a cycle in $D$, the algorithm contracts the strongly connected component $U \subseteq V$ to a single vertex and modifies the submodular function $f$ by regarding $U$ as a singleton.

For each $v \in V$, let $R(v)$ denote the set of vertices reachable from $v$ in $D$ and $f_v$ the submodular function on the subsets of $V \setminus R(v)$ defined by

$$f_v(X) = f(X \cup R(v)) - f(R(v)) \qquad (X \subseteq V \setminus R(v)).$$

A linear ordering $(v_1, \ldots, v_n)$ of $V$ is called consistent with $D$ if $i < j$ implies $(v_i, v_j) \notin F$. Consider an extreme base $x \in \mathrm{B}(f)$ generated by a linear ordering $(v_1, v_2, \ldots, v_n)$ consistent with $D$. The extreme base generated by a consistent linear ordering is also called consistent. For any consistent extreme base $x \in \mathrm{B}(f)$, the greedy algorithm defines $x(v)$ by $x(v) = f(U) - f(U \setminus \{v\})$ for some $U \subseteq V$ with $R(v) \subseteq U$. It then follows from the submodularity of $f$ that $x$ satisfies $x(v) \le f(R(v)) - f(R(v) \setminus \{v\})$ for each $v \in V$. In each iteration, the algorithm computes

$$\eta = \max\{f(R(v)) - f(R(v) \setminus \{v\}) \mid v \in V\}. \tag{4.1}$$
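
The quantities $R(v)$, a consistent ordering, and $\eta$ can be sketched as follows. The example is hypothetical throughout: the arc set `F` is chosen arbitrarily for illustration rather than derived from minimizers, and sorting by $|R(v)|$ is one simple way to obtain a consistent ordering when $D$ is acyclic, since an arc $(u, w)$ forces $R(w)$ to be a proper subset of $R(u)$.

```python
# Hypothetical example data: weights of a small undirected graph.
EDGES = {("a", "b"): 3, ("b", "c"): 1, ("a", "c"): 2, ("c", "d"): 4}

def f(X):
    """Cut capacity of X; submodular with f(empty set) = 0."""
    return sum(w for (u, v), w in EDGES.items() if (u in X) != (v in X))

def reachable_sets(vertices, arcs):
    """R(v): the vertices reachable from v in D = (V, F), including v itself."""
    succ = {v: [w for (u, w) in arcs if u == v] for v in vertices}
    result = {}
    for v in vertices:
        seen, stack = {v}, [v]
        while stack:
            u = stack.pop()
            for w in succ[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        result[v] = frozenset(seen)
    return result

def consistent_ordering(vertices, R):
    """Order V so that every arc (u, w) of an acyclic D has w before u."""
    return sorted(vertices, key=lambda v: len(R[v]))

def greedy_extreme_base(f, ordering):
    """Extreme base generated by `ordering`, as in (2.3)."""
    y, prefix, prev = {}, frozenset(), 0
    for v in ordering:
        prefix = prefix | {v}
        y[v] = f(prefix) - prev
        prev = f(prefix)
    return y

V = ["a", "b", "c", "d"]
F = {("a", "b")}  # hypothetical arc: "a in X implies b in X"
R = reachable_sets(V, F)
order = consistent_ordering(V, R)
x = greedy_extreme_base(f, order)
bounds = {v: f(R[v]) - f(R[v] - {v}) for v in V}
eta = max(bounds.values())  # equation (4.1)
print(order, x, eta)
```

The consistent extreme base satisfies $x(v) \le f(R(v)) - f(R(v) \setminus \{v\}) \le \eta$ componentwise, which is the upper bound needed before calling Fix.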

If $\eta \le 0$, then an extreme base $x \in \mathrm{B}(f)$ consistent with $D$ satisfies $x(v) \le 0$ for each $v \in V$. In this case $x^-(V) = x(V) = f(V)$, which implies that $V$ minimizes $f$ by Lemma 2.2. If in addition $f(V) = 0$, then the original function may have had a positive value of $f(V)$. Therefore, the algorithm returns $\emptyset$ or $V$ as a minimizer, according to whether $f(V) = 0$ or $f(V) < 0$.

If $\eta > 0$, let $u$ be an element that attains the maximum in the right-hand side of (4.1). Then we have $f(R(u)) = f(R(u) \setminus \{u\}) + \eta$, which implies that either $f(R(u)) > \eta/2 > 0$ or $f(R(u) \setminus \{u\}) \le -\eta/2 < 0$ holds.

In the former case ($f(R(u)) > \eta/2$), we have $f_u(V \setminus R(u)) = f(V) - f(R(u)) < -\eta/2$, because $f(V) \le 0$. The algorithm finds a consistent extreme base $x \in \mathrm{B}(f_u)$ generated by a linear ordering $(v_1, \ldots, v_k)$ consistent with $D$, where $k = |V \setminus R(u)|$. That is, let $x(v_1) = f_u(\{v_1\})$ and $x(v_j) = f_u(\{v_1, v_2, \ldots, v_j\}) - f_u(\{v_1, v_2, \ldots, v_{j-1}\})$ for $j = 2, \ldots, k$. Then the extreme base $x \in \mathrm{B}(f_u)$ satisfies $x(v) \le f(R(v)) - f(R(v) \setminus \{v\}) \le \eta$. Thus we may apply the procedure Fix$(f_u, x, \eta)$ to find an element $w \in V \setminus R(u)$ that belongs to every minimizer of $f_u$. Since $\kappa = \eta/2$, the procedure terminates within $O(\log n)$ scaling phases. Consequently, we obtain a new pair $(u, w)$ that is compatible with $f$. Hence the algorithm adds the arc $(u, w)$ to $F$.

In the latter case ($f(R(u) \setminus \{u\}) \le -\eta/2$), we compute an extreme base $x \in \mathrm{B}(f)$ consistent with $D$ by the greedy algorithm, and then apply the procedure Fix$(f, x, \eta)$ to find an element $w \in R(u)$ that belongs to every minimizer of $f$. Since $x(v) \le \eta$ for every $v \in V$ and $\kappa = \eta/2$, the procedure terminates within $O(\log n)$ scaling phases. Note that every minimizer of $f$ includes $R(w)$. Thus it suffices to minimize the submodular function $f_w$, which is now defined on a smaller underlying set. Figure 3 provides a formal description of the strongly polynomial-time algorithm.

Theorem 4.2: The algorithm in Figure 3 computes the minimizer of a submodular function in $O(n^7 \log n)$ time, which is strongly polynomial.

Proof. Each time we call the procedure Fix, the algorithm adds a new arc to $D$ or deletes a set of vertices. This can happen at most $n^2$ times. Thus the overall running time of the algorithm is $O(n^7 \log n)$, which is strongly polynomial. ■
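
The arithmetic behind this bound can be written out as follows, taking as given that one scaling phase costs $O(n^5)$ time (this is $O(n^2)$ augmentations at $O(n^3)$ each, as in the analysis of Section 3) and that $\kappa = \eta/2$ in every call of Fix:

```latex
\begin{align*}
  \text{phases per call of Fix}
    &= \left\lceil \log_2 \frac{n^3 \eta}{\kappa} \right\rceil
     = \left\lceil \log_2 (2 n^3) \right\rceil = O(\log n), \\
  \text{time per call of Fix}
    &= O(\log n) \cdot O(n^5) = O(n^5 \log n), \\
  \text{total time}
    &= n^2 \cdot O(n^5 \log n) = O(n^7 \log n).
\end{align*}
```

The number of phases depends only on $n$ because the ratio $\eta/\kappa$ is fixed at 2, which is what removes the dependence on the function values.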
Input: $f : 2^V \to \mathbf{R}$
Output: $X \subseteq V$ minimizing $f$

Initialization:
    $X \leftarrow \emptyset$

While $V \neq \emptyset$ do
    If $f(V) > 0$ then $f(V) \leftarrow 0$
    $\eta \leftarrow \max\{ f(R(v)) - f(R(v) \setminus \{v\}) \mid v \in V \}$
    If $\eta \le 0$ then break
    Let $u \in V$ attain the maximum above.
    If $f(R(u)) \ge \eta/2$ then
        Find a consistent extreme base $x \in B(f_u)$ by the greedy algorithm.
        $w \leftarrow \mathrm{Fix}(f_u, x, \eta)$
        If $u \in R(w)$ then
            Contract $\{ v \mid v \in R(w),\ u \in R(v) \}$ to a single vertex.
        Else $F \leftarrow F \cup \{(u, w)\}$
    Else
        Find a consistent extreme base $x \in B(f)$ by the greedy algorithm.
        $w \leftarrow \mathrm{Fix}(f, x, \eta)$
        $V \leftarrow V \setminus R(w)$
        $f \leftarrow f_w$
        Find a subset $Q$ of the original underlying set represented by $R(w)$.
        $X \leftarrow X \cup Q$
If $f(V) < 0$ then
    Find a subset $Q$ of the original underlying set represented by $V$.
    $X \leftarrow X \cup Q$
End.



Figure 3: A strongly polynomial-time algorithm for submodular function minimization. 
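The selection of $\eta$ and $u$ at the top of the loop can be sketched in Python as follows. The sketch assumes that $R(v)$ denotes the set of vertices reachable from $v$ in $D = (V, F)$, including $v$ itself; that definition lies outside this section, so it is an assumption here. The helper names `reachable` and `max_marginal`, the toy cut function, and the arc set `F` are hypothetical.

```python
def reachable(v, arcs):
    """Vertices reachable from v along arcs of D = (V, F), including v (assumed R(v))."""
    seen = {v}
    stack = [v]
    while stack:
        s = stack.pop()
        for (a, b) in arcs:
            if a == s and b not in seen:
                seen.add(b)
                stack.append(b)
    return frozenset(seen)

def max_marginal(f, V, arcs):
    """Return (u, eta) with eta = max over v of f(R(v)) - f(R(v) minus {v})."""
    best_u, eta = None, None
    for v in V:
        R = reachable(v, arcs)
        gain = f(R) - f(R - {v})
        if eta is None or gain > eta:
            best_u, eta = v, gain
    return best_u, eta

# Hypothetical example data: cut function of a small directed graph.
ARCS = {("a", "b"): 2, ("b", "c"): 1, ("c", "a"): 3}

def cut(X):
    """Total capacity of arcs leaving X."""
    return sum(c for (s, t), c in ARCS.items() if s in X and t not in X)

V = ["a", "b", "c"]
F = {("c", "a")}  # one arc already recorded in D

u, eta = max_marginal(cut, V, F)
# Case test from Figure 3: the first branch applies when f(R(u)) >= eta / 2.
former_case = cut(reachable(u, F)) >= eta / 2
print(u, eta, former_case)
```

Recomputing reachability from scratch for every vertex is wasteful but keeps the sketch short; an implementation aiming at the stated bounds would maintain $R(v)$ incrementally as arcs are added.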
5. Concluding Remarks



This paper presents a strongly polynomial-time algorithm for minimizing submodular 
functions defined on Boolean lattices. We now briefly discuss minimizing submodular 
functions defined on more general lattices. 

Consider a submodular function $f : \mathcal{D} \to \mathbf{R}$ defined on a distributive lattice $\mathcal{D}$
represented by a poset $\mathcal{P}$ on $V$. Then the associated base polyhedron is unbounded in
general (see [12]).

An easy way to minimize such a function $f$ is to consider the reduction of $f$ by a
sufficiently large vector. As described in [12, p. 56], we can compute an upper bound
$M$ on $|f(X)|$ ($X \in \mathcal{D}$). Let $f'$ be the rank function of the reduction by a vector with
each component being equal to $M$. The submodular function $f'$ is defined on $2^V$ and
the set of minimizers of $f'$ coincides with that of $f$. Thus, we may apply our algorithms.
However, each evaluation of the function value of $f'$ requires $O(n^2)$ elementary operations
in addition to a single call for the evaluation of $f$. Consequently, this approach takes
$O(n^7 \min\{\log(nM), n^2 \log n\})$ time.

Alternatively, we can slightly extend the algorithms in Sections 3 and 4 by keeping
the base $x \in B(f)$ as a convex combination of extreme bases $y_i$ plus a vector in
the characteristic cone of $B(f)$. The latter can be represented as a boundary of a
nonnegative flow in the Hasse diagram of the poset. This extension enables us to minimize $f$ in
$O(n^5 \min\{\log(nM), n^2 \log n\})$ time.

Submodular functions defined on modular lattices naturally arise in linear algebra.
Minimization of such functions has a significant application to computing canonical forms
of partitioned matrices [18, 20]. It remains an interesting open problem to develop an
efficient algorithm for minimizing submodular functions on modular lattices, even for
those specific functions that arise from partitioned matrices.

Independently of this work, and almost simultaneously, Schrijver has also developed
a combinatorial, strongly polynomial-time algorithm for submodular function minimization
[25]. His algorithm also extends Cunningham's approach. However, the resulting
algorithm is quite different from ours.

Acknowledgments 

We are grateful to Bill Cunningham, Michel Goemans, and Maiko Shigeno for their useful 
comments. 

References 






[1] R. E. Bixby, W. H. Cunningham, and D. M. Topkis: Partial order of a polymatroid 
extreme point, Math. Oper. Res., 10 (1985), 367-378. 

[2] W. H. Cunningham: Testing membership in matroid polyhedra, J. Combinatorial 
Theory, B36 (1984), 161-188. 

[3] W. H. Cunningham: On submodular function minimization, Combinatorica, 5 
(1985), 185-192. 

[4] J. Edmonds: Submodular functions, matroids, and certain polyhedra, Combinatorial
Structures and Their Applications, R. Guy, H. Hanani, N. Sauer, and J. Schönheim,
eds., Gordon and Breach, 69-87, 1970.

[5] J. Edmonds and R. Giles: A min-max relation for submodular functions on graphs,
Ann. Discrete Math., 1 (1977), 185-204.

[6] J. Edmonds and R. Karp: Theoretical improvements in algorithmic efficiency for
network flow problems, J. ACM, 19 (1972), 248-264.

[7] L. Fleischer, S. Iwata, and S. T. McCormick: A faster capacity scaling algorithm for
submodular flow, 1999.

[8] A. Frank and É. Tardos: An application of simultaneous Diophantine approximation
in combinatorial optimization, Combinatorica, 7 (1987), 49-65.

[9] S. Fujishige: Polymatroidal dependence structure of a set of random variables,
Information and Control, 39 (1978), 55-72.

[10] S. Fujishige: Lexicographically optimal base of a polymatroid with respect to a
weight vector, Math. Oper. Res., 5 (1980), 186-196.

[11] S. Fujishige: Submodular systems and related topics, Math. Programming Study, 22
(1984), 113-131.

[12] S. Fujishige: Submodular Functions and Optimization, North-Holland, 1991.

[13] M. X. Goemans and V. S. Ramakrishnan: Minimizing submodular functions over
families of subsets, Combinatorica, 15 (1995), 499-513.

[14] M. Grötschel, L. Lovász, and A. Schrijver: The ellipsoid method and its consequences
in combinatorial optimization, Combinatorica, 1 (1981), 169-197.

[15] M. Grötschel, L. Lovász, and A. Schrijver: Geometric Algorithms and Combinatorial
Optimization, Springer-Verlag, 1988.






[16] T.-S. Han: The capacity region of general multiple-access channel with correlated
sources, Information and Control, 40 (1979), 37-60.

[17] B. Hoppe and É. Tardos: The quickest transshipment problem, Proceedings of 5th
ACM/SIAM Symposium on Discrete Algorithms (1995), 512-521.

[18] H. Ito, S. Iwata, and K. Murota: Block-triangularization of partitioned matrices
under similarity/equivalence transformations, SIAM J. Matrix Anal. Appl., 15 (1994),
1226-1255.

[19] S. Iwata: A capacity scaling algorithm for convex cost submodular flows, Math.
Programming, 76 (1997), 299-308.

[20] S. Iwata and K. Murota: A minimax theorem and a Dulmage-Mendelsohn type
decomposition for a class of generic partitioned matrices, SIAM J. Matrix Anal.
Appl., 16 (1995), 719-734.

[21] L. Lovász: Submodular functions and convexity, Mathematical Programming: The
State of the Art, A. Bachem, M. Grötschel, and B. Korte, eds., Springer-Verlag, 1983,
235-257.

[22] H. Nagamochi and T. Ibaraki: Computing edge-connectivity in multigraphs and
capacitated graphs, SIAM J. Discrete Math., 5 (1992), 54-64.

[23] H. Narayanan: A rounding technique for the polymatroid membership problem,
Linear Algebra Appl., 221 (1995), 41-57.

[24] M. Queyranne: Minimizing symmetric submodular functions, Math. Programming,
82 (1998), 3-12.

[25] A. Schrijver: A combinatorial algorithm minimizing submodular functions in
strongly polynomial time, 1999.

[26] M. A. Sohoni: Membership in submodular and other polyhedra, Technical Report
TR-102-92, Department of Computer Science and Engineering, Indian Institute of
Technology, Bombay, India, 1992.

[27] A. Tamir: A unifying location model on tree graphs based on submodularity
properties, Discrete Appl. Math., 47 (1993), 275-283.






